Dependent Things

In real life, things depend on each other.

Say you can be born smart or dumb, and for the sake of simplicity, let's assume whether you're smart or dumb is just nature's flip of a coin. Now, whether you become a professor at Stanford is not entirely independent of that. I would argue that becoming a professor at Stanford is generally not very likely, so the probability might be 0.001, but it also depends on whether you're born smart or dumb. If you are born smart, the probability might be larger, whereas if you're born dumb, the probability might be markedly smaller.

Now, this is just an example, but you can think of it as two consecutive coin flips. The first is whether you are born smart or dumb. The second is whether you get a certain job. And taken together, these two coin flips are not independent anymore. Whereas in our last unit we assumed that the coin flips were independent--that is, the outcome of the first didn't affect the outcome of the second--from now on we're going to study the more interesting cases where the outcome of the first does impact the outcome of the second, and to do so we need more variables to express these cases.

Cancer Example

Question 1

To do so, let's study a medical example. Suppose there's a patient in the hospital who might suffer from a medical condition like cancer. Let's say the probability of having this cancer is 0.1. Given that, can you tell me the probability of being cancer free?

Answer

  • The answer is 0.9, which is just 1 minus the probability of cancer.
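
Written as an equation, this is the complement rule:

```latex
P(\neg C) = 1 - P(C) = 1 - 0.1 = 0.9
```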

Question 2

Of course, in reality, we don't know whether a person suffers from cancer, but we can run a test like a blood test. The outcome of that blood test may be positive or negative, but like any good test, it tells me something about the thing I really care about--whether the person has cancer or not.

Let's say, if the person has the cancer, the test comes up positive with probability 0.9, and that implies that if the person has cancer, the negative outcome has probability 0.1, because these two things have to add up to 1.

I've just given you a fairly complicated notation that says the outcome of the test depends on whether the person has cancer or not. We call this a conditional probability, and it is written in a very funny notation. There's a bar in the middle, and the bar says: what's the probability of the stuff on the left, given that we assume the stuff on the right is actually the case?
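
Written in this notation, what we have said so far about the test reads:

```latex
P(\text{Pos} \mid C) = 0.9, \qquad P(\text{Neg} \mid C) = 1 - 0.9 = 0.1
```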

Now, in reality, we don't know whether the person has cancer or not, and in a later unit, we're going to reason about whether the person has cancer given a certain data set, but for now, we assume we have god-like capabilities. We can tell with absolute certainty that the person has cancer, and we can determine what the outcome of the test is. This is a test that isn't exactly deterministic--it makes mistakes, but it only makes a mistake in 10% of the cases, as illustrated by the 0.1 down here.

Now, it turns out, I haven't fully specified the test. The same test might also be applied to a situation where the person does not have cancer. This little symbol over here is my shortcut for not having cancer. And now, let me say the probability of the test giving me a positive result--a false positive--when there's no cancer is 0.2. Can you now tell me the probability of a negative outcome in the case where we know for a fact that the person doesn't have cancer?

Answer

And the answer is 0.8. As I'm sure you noticed, in the case where there is cancer, the possible test outcomes add up to 1, and in the case where there isn't cancer, the possible test outcomes also add up to 1. So 1 - 0.2 = 0.8.
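
That completes the test model; each conditional distribution sums to 1:

```latex
P(\text{Pos} \mid \neg C) = 0.2, \qquad P(\text{Neg} \mid \neg C) = 1 - 0.2 = 0.8
```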

Screenshot taken from Udacity

Question 3

Look at this--this is very nontrivial, but armed with this, we can now build up the truth table for all the combinations of the two variables: cancer or non-cancer, and positive or negative test outcome.

So, let me write down cancer and test and go through the different possibilities. We could have cancer or not, and the test may come up positive or negative. So, please give me the probability of the combination of those for the very first one, and as a hint, it's kind of the same as before, where we multiply two things, but you have to find the right things to multiply in this table over here.

Answer

And the answer is: the probability of cancer is 0.1, the probability of the test being positive given that the person has cancer is the one over here--0.9--and multiplying those two together gives us 0.09.
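
In notation, this first truth table entry is the product of the prior and the conditional:

```latex
P(C, \text{Pos}) = P(C)\,P(\text{Pos} \mid C) = 0.1 \times 0.9 = 0.09
```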

Screenshot taken from Udacity

Question 4

Moving to the next case--what do you think the probability is that the person does have cancer but the test comes back negative? What's the joint probability of these two events?

Answer

And once again, we'd like to refer to the corresponding numbers over here on the right side: 0.1 for the cancer, times the probability of getting a negative result conditioned on having cancer, which is also 0.1. 0.1 times 0.1 is 0.01.
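
As an equation:

```latex
P(C, \text{Neg}) = P(C)\,P(\text{Neg} \mid C) = 0.1 \times 0.1 = 0.01
```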

Screenshot taken from Udacity

Question 5

Moving on to the next two, we have:

Answer

  • Cancer (N) - Test (P): Here the answer is 0.18, obtained by multiplying the probability of not having cancer, which is 0.9, with the probability of getting a positive test result for a non-cancer patient, which is 0.2. Multiplying 0.9 by 0.2 gives me 0.18.
  • Cancer (N) - Test (N): Here you get 0.72, which is the product of not having cancer in the first place, 0.9, and the probability of getting a negative test result under the condition of not having cancer, 0.8. Both rows are written out as equations below.
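
The two remaining rows in the same notation:

```latex
P(\neg C, \text{Pos}) = 0.9 \times 0.2 = 0.18, \qquad P(\neg C, \text{Neg}) = 0.9 \times 0.8 = 0.72
```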

Screenshot taken from Udacity

Question 6

Now quickly add all of those probabilities up.

Answer

And as usual, the answer is 1. That is, the truth table covers all possible cases, and when we add up their probabilities, we should always get 1.
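
To make the bookkeeping concrete, here is a minimal Python sketch of the same truth table (the variable names are my own, not from the lecture); it builds all four joint probabilities from the three given numbers and checks that they sum to 1:

```python
# Model parameters from the lecture.
p_cancer = 0.1             # P(C)
p_pos_given_cancer = 0.9   # P(Pos | C)
p_pos_given_healthy = 0.2  # P(Pos | not C), the false positive rate

# Each truth table row is P(hypothesis) * P(test outcome | hypothesis).
joint = {
    ("cancer", "pos"):  p_cancer * p_pos_given_cancer,               # 0.09
    ("cancer", "neg"):  p_cancer * (1 - p_pos_given_cancer),         # 0.01
    ("healthy", "pos"): (1 - p_cancer) * p_pos_given_healthy,        # 0.18
    ("healthy", "neg"): (1 - p_cancer) * (1 - p_pos_given_healthy),  # 0.72
}

for row, p in joint.items():
    print(row, round(p, 4))

# The rows cover all possible cases, so they must sum to 1.
assert abs(sum(joint.values()) - 1.0) < 1e-9
```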

Screenshot taken from Udacity

Question 7

Now let me ask you a really tricky question. What is the probability of a positive test result? That is, can you determine, irrespective of whether there's cancer or not, the probability of getting a positive test result?

Answer

And the result, once again, is found in the truth table, which is why this table is so powerful. Let's look at where in the truth table we get a positive test result: it's right here and right here. If we take the corresponding probabilities, 0.09 and 0.18, and add them up, we get 0.27.
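
This is the marginal probability of a positive test, obtained by summing the matching rows of the truth table:

```latex
P(\text{Pos}) = P(C, \text{Pos}) + P(\neg C, \text{Pos}) = 0.09 + 0.18 = 0.27
```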

Screenshot taken from Udacity

Total Probability

Putting all of this into mathematical notation: we're given the probability of having cancer, and from that follows the probability of not having cancer. We're also given two conditional probabilities for the test being positive--one in the case of cancer and one in the cancer-free case.

From the first, we can infer the probability of the test being negative given cancer, and from the second, the probability of a negative test result in the cancer-free case. These things are easily inferred by the 1-minus rule.

We then compute the probability of a positive test result as the sum of: the probability of a positive test result given cancer, times the probability of cancer--which is our truth table entry for the combination of P and C--plus the same product for the case where we don't have cancer.

This notation looks confusing and complicated; if you ever dive deep into probability, it's called total probability. But it's useful to know that it is very, very intuitive, and to further develop that intuition, let me give you another exercise of exactly the same type.
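
In formula form, total probability for our example reads:

```latex
P(\text{Pos}) = P(\text{Pos} \mid C)\,P(C) + P(\text{Pos} \mid \neg C)\,P(\neg C) = 0.9 \times 0.1 + 0.2 \times 0.9 = 0.27
```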

Screenshot taken from Udacity

Two Coins

Question 1

This time around, we have a bag, and in the bag are 2 coins, coin 1 and coin 2. In advance, we know that coin 1 is fair, so the probability of coin 1 coming up heads is 0.5, whereas coin 2 is loaded, that is, the probability of coin 2 coming up heads is 0.9. Quickly, give me the following numbers: the probability of coming up tails for coin 1 and for coin 2.

Answer

And the answer is 0.5 for coin 1 and 0.1 for coin 2, because these things have to add up to 1 for each of the coins.
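
Once more the complement rule, applied per coin:

```latex
P(T \mid \text{coin 1}) = 1 - 0.5 = 0.5, \qquad P(T \mid \text{coin 2}) = 1 - 0.9 = 0.1
```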

Screenshot taken from Udacity

Question 2

So now what happens is, I'm going to remove one of the coins from this bag, and each coin, coin 1 or coin 2, is picked with equal probability. Let me now flip that coin once, and I want you to tell me: what's the probability that this coin--which with 50% chance is the fair coin 1 and with 50% chance the loaded coin 2--comes up heads? Again, this is an exercise in conditional probability.

Answer

Let's do the truth table. You have a pick event followed by a flip event.

  • We can pick coin 1 or coin 2, with a 0.5 chance for each of the coins. Then we can flip the coin we've chosen and get heads or tails. Now what are the probabilities?
  • I'd argue we pick coin 1 at 0.5, and once I've picked the fair coin, I know that the probability of heads is, once again, 0.5, which makes it 0.25. The same is true for picking the fair coin and getting tails.
  • If we pick the unfair coin, with a 0.5 chance, we get a 0.9 chance of heads, so 0.5 times 0.9 gives you 0.45, whereas for the unfair coin the probability of tails is 0.1, which multiplied by the probability of picking it, 0.5, gives us 0.05.
  • Now when I ask you what's the probability of heads, we find that 2 of those cases indeed come up heads, so we add 0.25 and 0.45 and get 0.7. So in this example there is a 0.7 chance that we generate heads, as the equation after this list shows.
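
The same computation as a total-probability equation:

```latex
P(H) = P(H \mid \text{coin 1})\,P(\text{coin 1}) + P(H \mid \text{coin 2})\,P(\text{coin 2}) = 0.5 \times 0.5 + 0.9 \times 0.5 = 0.7
```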

Screenshot taken from Udacity

Question 3

Now let me up the ante by flipping this coin twice. Once again, I'm drawing a coin from the bag, and I pick one at 50% chance. I don't know which one I have picked; it might be fair or loaded. And in flipping it twice, I get first heads, and then tails. So here's what I do: I draw a coin at random with the probabilities shown, and then I flip that same coin twice--I draw it once and then flip it twice. What's the probability of seeing heads first and then tails? Again, you might derive this using truth tables.

Answer

This is a non-trivial question, and the right way to do this is to go through the truth table, which I've drawn over here. There are 3 different things happening: we make an initial pick of the coin, which can be coin 1 or coin 2 with equal probability, then we flip it for the first time, with heads or tails as outcomes, and then we flip it a second time, with a second outcome. These different cases make up my truth table.

I now need to look at just the cases where heads is followed by tails--this one right here and this one over here. Then we compute the probability for those 2 cases.

  • The probability of picking coin 1 is 0.5. For the fair coin, we get 0.5 for heads, followed by 0.5 for tails. Multiplied together, that is 0.125.
  • Let's do the same for the second case. There's a 0.5 chance of picking coin 2. That one comes up heads at 0.9 and tails at 0.1. Multiplying these together gives us 0.045, a smaller number than up here.
  • Adding these 2 things together results in 0.17, which is the right answer to the question over here (see the sketch after this list for a mechanical check).
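
If you want to check answers like this mechanically, here is a minimal Python sketch (the function name and structure are my own, not from the lecture) that sums the truth table over both coins for any flip sequence:

```python
# P(heads) for each coin; a coin is picked uniformly at random.
COINS = {"coin 1": 0.5, "coin 2": 0.9}

def sequence_probability(flips, coins=COINS):
    """P(seeing the flip sequence `flips`, e.g. "HT") when one coin is
    drawn at random and then flipped len(flips) times."""
    total = 0.0
    for p_heads in coins.values():
        p = 1.0 / len(coins)  # probability of picking this coin
        for flip in flips:
            p *= p_heads if flip == "H" else 1 - p_heads
        total += p  # sum the matching truth table rows
    return total

print(round(sequence_probability("H"), 4))   # 0.7, as in the last question
print(round(sequence_probability("HT"), 4))  # 0.125 + 0.045 = 0.17
```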

Screenshot taken from Udacity

Question 4

Let me do this once again. There are 2 coins in the bag, coin 1 and coin 2, and as before, we pick coin 1 with 0.5 probability. But now I'm telling you that coin 1 is loaded: it gives you heads with probability 1. Think of it as a coin that only has heads. And coin 2 is also loaded; it gives you heads with 0.6 probability. Now work out for me, in this experiment, what's the probability of seeing tails twice?

Answer

And the answer is depressing. If you, once again, draw the truth table, you find, for the different combinations, that if you've drawn coin 1, you'd never see tails. So this case over here, which indeed has tails, tails, has 0 probability.

  • We can work this out: the probability of drawing the first coin is 0.5, but the probability of tails given the first coin must be 0, because the probability of heads is 1. So 0.5 times 0 times 0 is 0.
  • So the only case where you might see tails, tails is when you actually drew coin 2, and that has a probability of 0.5, times the probability of tails given that we drew the second coin, which is 0.4, times 0.4 again. That gives 0.08, which is the correct answer (written out below).
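
As a single equation over the two branches of the truth table:

```latex
P(T, T) = 0.5 \times 0 \times 0 + 0.5 \times 0.4 \times 0.4 = 0.08
```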

Screenshot taken from Udacity

Summary

So there are important lessons in what we just learned. The key thing is that we talked about conditional probabilities. We said that the outcome of a variable, like a test, is not like a plain random coin flip--it depends on something else, like a disease.

When we looked at this, we were able to predict the probability of a test outcome even if we don't know whether the person has the disease or not. And we did this using the truth table, and in the truth table, we summed over multiple lines.

  • For example, we multiplied the probability of a test outcome conditioned on this unknown variable, whether the person is diseased, by the probability of the disease being present. Then we added a second row of the truth table, where our unobserved disease variable took the opposite value of not diseased. (The formula after this list spells this out.)
  • Written this way, it looks really clumsy, but that's effectively what we did when we went through the truth table. So we now understand that certain coin flips depend on other coin flips: if god, for example, flips the coin of us having a disease or not, then the medical test again has a random outcome, but its probability really depends on whether we have the disease or not. We have to consider this when we do probabilistic inference. In the next unit, we're going to ask the real question. Say we really care about whether we have a disease like cancer or not. What do you think the probability is, given that our doctor just gave us a positive test result?
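
In symbols, with D standing for the disease and Pos for a positive test, the rule in the first bullet is once more total probability:

```latex
P(\text{Pos}) = P(\text{Pos} \mid D)\,P(D) + P(\text{Pos} \mid \neg D)\,P(\neg D)
```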

And I can tell you, you will be in for a surprise.

Screenshot taken from Udacity